Radiomics models based on multisequence MRI for predicting PD-1/PD-L1 expression in hepatocellular carcinoma

The purpose of this study was to explore the effectiveness of radiomics based on multisequence MRI in predicting the expression of PD-1/PD-L1 in hepatocellular carcinoma (HCC). One hundred and eight patients with HCC who underwent contrast-enhanced MRI 2 weeks before surgical resection were enrolled in this retrospective study. Corresponding paraffin sections were collected for immunohistochemistry to detect the expression of PD-1 and PD-L1. All patients were randomly divided into a training cohort and a validation cohort at a ratio of 7:3. Univariate and multivariate analyses were used to select potential clinical characteristics related to PD-1 and PD-L1 expression. Radiomics features were extracted from the axial fat-suppression T2-weighted imaging (FS-T2WI) images and the arterial phase and portal venous phase images from the axial dynamic contrast-enhanced MRI, and the corresponding feature sets were generated. The least absolute shrinkage and selection operator (LASSO) was used to select the optimal radiomics features for analysis. Logistic regression analysis was performed to construct single-sequence and multisequence radiomics and radiomic-clinical models. The predictive performance was judged by the area under the receiver operating characteristic curve (AUC) in the training and validation cohorts. In the whole cohort, PD-1 expression was positive in 43 patients, and PD-L1 expression was positive in 34 patients. The presence of satellite nodules served as an independent predictor of PD-L1 expression. The AUC values of the FS-T2WI, arterial phase, portal venous phase and multisequence models in predicting the expression of PD-1 were 0.696, 0.843, 0.863, and 0.946 in the training group and 0.669, 0.792, 0.800 and 0.815 in the validation group, respectively. 
The AUC values of the FS-T2WI, arterial phase, portal venous phase, multisequence and radiomic-clinical models in predicting PD-L1 expression were 0.731, 0.800, 0.800, 0.831 and 0.898 in the training group and 0.621, 0.743, 0.771, 0.810 and 0.779 in the validation group, respectively. The combined models showed better predictive performance. The results of this study suggest that a radiomics model based on multisequence MRI has the potential to predict the preoperative expression of PD-1 and PD-L1 in HCC, which could become an imaging biomarker for immune checkpoint inhibitor (ICI)-based treatment.

and induces T-cell apoptosis or dysfunction after it binds to PD-1, eventually leading to tumor immune escape 12. Previous studies have shown that the expression status of PD-1/PD-L1 in tumors is associated with treatment responses and clinical outcomes following PD-1/PD-L1 pathway inhibition 13-21 and can be used as a biomarker for predicting the effectiveness of ICI treatment 22,23. The main challenge lies in selecting the patient subgroup that would benefit most from blocking the PD-1/PD-L1 pathway and in sparing the remaining patients an ineffective treatment. Therefore, it is crucial to evaluate the expression status of PD-1/PD-L1 in HCC patients before treatment. However, PD-1/PD-L1 detection currently depends mainly on immunohistochemical analysis of pathological tissue acquired by resection or biopsy. There is an urgent need to develop a noninvasive method to predict the PD-1/PD-L1 expression status in HCC preoperatively.
In recent years, rapid developments in artificial intelligence have given it an important role in personalized precision medicine. Radiomics is a new technology that can transform potential pathophysiological information in medical images into high-dimensional quantitative imaging features 24,25. It can help with tumor classification and prediction by finding relationships between quantitative imaging features and clinical and genetic data. Magnetic resonance imaging (MRI)-based radiomics has been applied in many clinical areas 26,27, but few studies have addressed the value of radiomics models based on multisequence MRI in preoperatively predicting PD-1/PD-L1 expression in HCC patients. This paper explores the effectiveness of preoperatively predicting the PD-1/PD-L1 expression status in HCC patients from multisequence MRI radiomics features.

Materials and methods
Patients. The present study was conducted in accordance with the Declaration of Helsinki, and the requirement for informed consent was waived due to the retrospective nature of the study and the anonymous collection of data without any risk for the patient. The study was approved by the Ethics Committee of the Affiliated Hospital of North Sichuan Medical University (No. 2022ER013-1). The preoperative clinical, MRI, and pathological data of patients with postoperative pathologically confirmed HCC who underwent surgical resection at the Affiliated Hospital of North Sichuan Medical University from January 2018 to June 2021 were retrospectively analyzed. The inclusion criteria were as follows: (1) Postoperative pathologically confirmed HCC. (2) Multisequence MRI examination of the upper abdomen performed within 2 weeks before surgery. (3) No prior antitumor therapy. The exclusion criteria were as follows: (1) Incomplete data. (2) Maximum diameter of the lesion less than 2 cm. (3) Combined HCC and intrahepatic cholangiocarcinoma. The included patients were randomly assigned to the training group and validation group in a 7:3 ratio.
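The 7:3 random assignment described above can be illustrated with a minimal Python sketch. The function name `split_cohort`, the fixed seed, and the use of sequential patient IDs are assumptions made for illustration only; the source does not state the software used for the split, whether it was stratified by expression status, or the resulting cohort sizes.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Randomly partition patient IDs into training and validation lists.

    The seed is a hypothetical choice that only makes the sketch
    reproducible; the split is not stratified.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# 108 enrolled patients, labeled 1..108 for illustration
train_ids, valid_ids = split_cohort(range(1, 109))
print(len(train_ids), len(valid_ids))
```

With 108 patients and simple rounding, this sketch yields 76 training and 32 validation cases; other rounding conventions would shift one patient between the groups.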
Immunohistochemistry. The expression of PD-1/PD-L1 was detected by immunohistochemistry. Pathological sections were independently evaluated by two doctors, and disagreements were resolved by discussion. Tonsil tissue was used as a positive control. The results of PD-1/PD-L1 immunohistochemical staining were scored in accordance with published methods 28-30. PD-1 expression was scored according to the percentage of positive cells and staining intensity. Positive staining was defined as light-yellow to dark-brown staining of the cell membrane or cytoplasm. The entire field of view of each section was observed under a low-magnification microscope, and then six randomly selected fields of view with lymphocyte aggregation were read under high magnification (400×). The scoring scale for the proportion of positive cells was as follows: < 5%: 0 points; 5-24%: 1 point; 25-49%: 2 points; 50-100%: 3 points. The scoring scale for staining intensity was as follows: no staining: 0 points; light yellow: 1 point; light brown: 2 points; dark brown: 3 points. The average of the total scores (positive cells + staining) of the six fields of view was calculated. An average score of < 3 was deemed negative for PD-1 expression, and an average score of ≥ 3 was deemed positive for PD-1 expression 17,21. PD-L1 expression was scored as the proportion of PD-L1 staining in tumor cells. Positive staining was defined as light-yellow to dark-brown staining of the cell membrane or cytoplasm; the proportion of tumor cells stained with PD-L1 was the percentage of stained tumor cells out of all tumor cells in the section. Positive expression was defined as a proportion of positively stained cells ≥ 1%.
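The scoring rules above can be written out as a short Python sketch. All function and variable names are hypothetical, and the handling of non-integer percentages between the listed bands (for example 24.5%) is an assumption, since the scale only lists integer ranges.

```python
# Points for staining intensity, as listed in the scoring scale above
INTENSITY_POINTS = {
    "none": 0,
    "light yellow": 1,
    "light brown": 2,
    "dark brown": 3,
}


def proportion_points(percent_positive):
    """Points for the percentage of positive cells in one field of view."""
    if percent_positive < 5:
        return 0
    if percent_positive < 25:  # assumed upper bound of the 5-24% band
        return 1
    if percent_positive < 50:  # assumed upper bound of the 25-49% band
        return 2
    return 3


def pd1_status(fields):
    """fields: (percent_positive, intensity_label) for each high-power field."""
    totals = [proportion_points(p) + INTENSITY_POINTS[i] for p, i in fields]
    mean_score = sum(totals) / len(totals)
    return mean_score, ("positive" if mean_score >= 3 else "negative")


def pdl1_status(stained_tumor_cells, total_tumor_cells):
    """PD-L1 is positive when at least 1% of tumor cells are stained."""
    percent = 100.0 * stained_tumor_cells / total_tumor_cells
    return percent, ("positive" if percent >= 1.0 else "negative")


# Six illustrative (made-up) fields of view for one section
example_fields = [
    (30, "light brown"), (10, "light yellow"), (60, "dark brown"),
    (3, "none"), (26, "light yellow"), (50, "light brown"),
]
print(pd1_status(example_fields))
print(pdl1_status(10, 1000))
```

In this made-up example the six field totals are 4, 2, 6, 0, 3 and 5, giving a mean of about 3.33, so the section would be called PD-1 positive.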

MR image acquisition.
Scanning was performed using a Discovery 750 3.0-T superconducting MRI scanner (GE, USA) with a 32-channel phased-array surface coil. All study subjects fasted for 4 h before the MRI scan and were taught breathing exercises. The scanning sequences were axial fat-suppression T2-weighted imaging (FS-T2WI) and an axial dynamic contrast-enhanced 3D-LAVA sequence (Table 1). The contrast agent used for dynamic enhancement was Gd-DTPA at a dose of 15-20 mL. A high-pressure syringe was used to inject the contrast agent through a vein on the back of the hand at a rate of 2-2.5 mL/s. The hepatic arterial phase, portal venous phase, and delayed phase were scanned after contrast medium injection.
Tumor segmentation and feature extraction. The volume of the entire tumor was delineated layer by layer along the edge of the lesion as regions of interest on FS-T2W images and on axial dynamic contrast-enhanced images in the arterial phase and portal venous phase (Fig. 1). The radiomics features were extracted and divided into four categories: gray-level co-occurrence matrix (GLCM), gray-level run length matrix (GLRLM), intensity histogram, and shape. A separate feature dataset was generated for each of the FS-T2WI, arterial phase, and portal venous phase images. Interobserver agreement was tested on the results recorded by two radiologists (observers 1 and 2, with 2 and 5 years of experience, respectively). The intraclass correlation coefficient (ICC) was used to assess interobserver agreement; an ICC ≥ 0.75 was taken to indicate good consistency between the two observers.
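As an illustration of the interobserver check, the following Python sketch computes one common form of the ICC and applies the 0.75 threshold. The source does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), is an assumption here, and the function names and feature values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per tumor and one column per observer."""
    n = len(ratings)       # number of tumors
    k = len(ratings[0])    # number of observers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def stable_features(feature_tables, threshold=0.75):
    """Keep the names of features whose ICC reaches the threshold."""
    return [name for name, table in feature_tables.items()
            if icc_2_1(table) >= threshold]


# Hypothetical values of two features measured by two observers on five tumors
tables = {
    "feature_a": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2], [5.0, 5.1]],
    "feature_b": [[1.0, 3.0], [2.0, 1.0], [3.0, 5.0], [4.0, 2.0], [5.0, 4.0]],
}
print(stable_features(tables))
```

A feature would be retained only if its values from the two observers' segmentations agree closely across tumors; the check would be repeated for each sequence's feature dataset.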

Feature screening and model establishment.
In the first step, to eliminate differences in scale among the features, all the data were standardized by the z score normalization method. In the second step, features with an ICC < 0.75 were eliminated; this consistency check was applied to the dataset generated from each sequence. In the third step, the screened stable features were analyzed by a univariate statistical method (the independent-sample t test or Mann-Whitney U test was chosen according to the characteristics of the data distribution). Stable features that differed significantly according to PD-1/PD-L1 expression status were selected (P < 0.05). In the fourth step, to avoid overfitting, least absolute shrinkage and selection operator (LASSO) regression analysis was used to select the core radiomic features predicting PD-1/PD-L1 expression. Using the minimum criterion (1 minus standard error), the regularization parameter (λ) was tuned by tenfold cross-validation. The optimal radiomics features selected from each sequence were used to construct single-sequence and multisequence radiomics prediction models by logistic regression 31,32. The radiomic-clinical model was constructed by combining multisequence radiomics features and clinical characteristics. The training set data were used to train the model, and the validation set data were used to validate the model. The predictive performance of the models was evaluated by calculating the area under the receiver operating characteristic (ROC) curve (AUC) together with the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and F1-score derived from the confusion matrix.
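Two of the steps above, z score standardization and the confusion-matrix metrics, can be sketched in a few lines of Python. This is an illustration only and not the authors' R code; the function names are hypothetical, the use of the sample standard deviation is an assumption, and labels are taken to be 1 for positive and 0 for negative expression.

```python
import math
import statistics


def zscore(values):
    """Standardize one feature to zero mean and unit (sample) standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample SD, n - 1 denominator
    return [(v - mean) / sd for v in values]


def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, NPV, accuracy and F1 from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "accuracy": _ratio(tp + tn, len(pairs)),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
    }


# Made-up labels and thresholded predictions for eight patients
truth = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
print(confusion_metrics(truth, predicted))
```

In practice the predicted labels would come from thresholding the logistic regression output, and the threshold chosen affects every metric except the AUC.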
Statistical analysis. R software (version 4.0.2; https://www.r-project.org/) was used for the statistical analysis in this study. The R packages used included "psych", "glmnet", and "pROC". "psych" was used to assess the interobserver agreement (ICC) of the radiomics features; "glmnet" was used to perform LASSO regression analysis; "pROC" was used to draw the ROC curves. Quantitative data are described as the median. The Shapiro-Wilk test was used to assess the normality of the distribution of these variables, and the Bartlett test was used to assess homogeneity of variance. When both normality and homogeneity of variance were satisfied, the independent-sample t test was used; otherwise, the Mann-Whitney U test was used for comparisons between groups. Categorical variables are described as percentages, and the chi-squared test was used for comparisons between groups. A two-tailed P value < 0.05 was considered statistically significant.
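For readers unfamiliar with the AUC, the following Python sketch computes it from its rank interpretation: the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. It is an illustration and not the "pROC" implementation used in the study; the function name and the example scores are hypothetical.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for positive expression, 0 for negative expression.
    scores: model outputs, where higher means more likely positive.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Four made-up patients: three of the four positive-negative pairs are ordered correctly
print(auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise loop is quadratic in cohort size, which is negligible for a cohort of 108 patients.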

Results
In all, 147 patients were considered for enrollment, and of these, 108 patients met the criteria and were included in this study (Fig. 2). Among the 108 enrolled patients, 95 were male and 13 were female; 81 patients had liver cirrhosis, and 37 patients were diagnosed with multiple tumors. The maximum tumor diameter ranged from 2.0 to 20.1 cm. Positive PD-1 expression was observed in 43 patients, while negative PD-1 expression was observed in 65 patients. A total of 34 patients had PD-L1-positive expression, and 74 had PD-L1-negative expression. Among the clinical characteristics, the presence of satellite nodules served as an independent predictor of PD-L1 expression (Fig. 3; Tables 2, 3).
We extracted 352 features from each of the FS-T2WI, arterial phase, and portal venous phase datasets, and features with an ICC lower than 0.75 were excluded. The remaining features were further analyzed (Fig. 4). In the analysis of PD-1 expression status, there were 221, 333, and 331 features in the FS-T2WI, arterial phase, and portal venous phase datasets, respectively, that were significantly different according to the independent-sample t test or Mann-Whitney U test (P < 0.05). LASSO regression selected two, six, and five optimal features from the statistically significant radiomics features, respectively (Fig. 5). In the analysis of PD-L1 expression status, according to the independent-sample t test or Mann-Whitney U test, there were 221, 326, and 344 features in the FS-T2WI, arterial phase, and portal venous phase datasets, respectively, that were significantly different between the PD-L1-negative and PD-L1-positive groups (P < 0.05). LASSO regression selected two, four, and six optimal features from the statistically significant radiomics features, respectively (Fig. 6).
In the analysis of PD-1 expression status, the two, six, and five features screened in the above steps from the FS-T2WI, arterial phase, and portal venous phase datasets, respectively, were used to construct the FS-T2WI, arterial phase, and portal venous phase radiomics models. These features were also combined to construct the multisequence model. The predictive performance of the models was evaluated by the AUC, sensitivity, specificity, PPV, NPV, accuracy, and F1-score. The AUC values of the FS-T2WI, arterial phase, portal venous phase, and multisequence radiomics models were 0.696, 0.843, 0.863, and 0.946, respectively, in the training group and 0.669, 0.792, 0.800, and 0.815, respectively, in the validation group (Fig. 7).
In the analysis of PD-L1 expression status, the two, four, and six features screened from the FS-T2WI, arterial phase, and portal venous phase datasets, respectively, in the above steps were used to construct the FS-T2WI, arterial phase, and portal venous phase radiomics models. These features were then synthesized to construct the multisequence model, and the predictive performance of all models was evaluated using the above metrics. The AUC values of the FS-T2WI, arterial phase, portal venous phase, multisequence and radiomic-clinical models were 0.731, 0.800, 0.800, 0.831 and 0.898, respectively, in the training group and 0.621, 0.743, 0.771, 0.810 and 0.779 in the validation group. The combined models had better predictive performances (Table 5, Fig. 8).

Discussion
The PD-1/PD-L1 pathway plays a key role in the development of chronic liver infection, tumor immune response evasion, and tumor microenvironment formation 33. Previous studies have indicated that the overexpression of PD-1/PD-L1 in HCC patients is closely related to poor prognosis and tumor recurrence 14,17,34-40; PD-1/PD-L1 expression may serve as a biomarker for predicting ICI treatment response in HCC patients 13,33,36,41,42. Radiomics extracts high-dimensional data from traditional medical images that cannot be assessed by the naked eye, and these data correlate strongly with heterogeneity at the cellular level 43. Recently, radiomics has been applied to the analysis of PD-1/PD-L1 expression in lung cancer, breast cancer, etc. 30,44-50, but it has rarely been applied to studies of the PD-1/PD-L1 expression status in HCC 51-53. Tian et al. 52 extracted radiomics and deep learning features based on preoperative T2WI sequences and used an integrated model to predict the expression of PD-L1 in HCC tissues. The results showed that the AUC of the radiomics-based model was 0.794 ± 0.035; the model combining radiomics and deep learning features achieved the best predictive performance, with an AUC value of 0.897 ± 0.084. However, they only studied a single sequence, T2WI, and did not incorporate other sequences, such as contrast-enhanced MRI. In the present study, we established a multisequence MRI-based radiomics model by integrating the radiomic features from FS-T2WI, contrast-enhanced arterial-phase, and portal venous-phase images. Because different sequences reveal different information about the tumor, the multisequence combined radiomics model had the best predictive performance. The conclusion of the present study is consistent with the literature 54-57.
In this study, 13 and 12 core radiomic features were extracted from FS-T2WI, arterial-phase, and portal venous-phase MRI images to construct radiomics models for predicting PD-1 and PD-L1 expression, respectively. [...] 60 converted MRI radiomics features of liver tumors into a quantitative Radscore for the preoperative prediction of PD-1/PD-L1 expression, and they found that the radiomics features associated with PD-1/PD-L1 expression were mainly GLCM features. Second, morphological features were also closely related to the expression of PD-L1. Max3Ddiameter and SurfaceAreaDensity are the most relevant of these features: Max3Ddiameter is the longest 3D diameter of the tumor mass, and [...] 37 showed that PD-L1 overexpression was significantly associated with tumor differentiation, history of hepatitis, elevated alpha-fetoprotein (AFP), and tumor-infiltrating lymphocytes and was not significantly associated with the maximum tumor diameter (P = 0.07) or tumor number (P = 0.54). A meta-analysis by Zhang et al. 67 showed that PD-1 expression was significantly correlated with age (P = 0.023) and AFP (P = 0.000). Among all the clinical factors analyzed in this study, only the maximum tumor diameter and tumor number were associated with PD-L1 expression, which is consistent with the findings of Hu et al. 65. Multivariate analysis revealed that the presence of satellite nodules was an independent predictor of PD-L1 expression. No other clinical factors were found to be associated with PD-L1 expression, nor was PD-1 expression found to be associated with any relevant clinical indicator.
This study has the following limitations. First, the sample was small. Because many HCC patients who did not undergo surgical resection or MRI scans were excluded, there may be potential selection bias. Second, the study used data from a single center. The results must be externally validated in other centers. Third, other MRI sequences, such as diffusion-weighted imaging, were not analyzed in this study, and therefore, information from other sequences was ignored. Finally, this study did not develop a prediction model including genetic variables. A combined model for predicting PD-1/PD-L1 expression should be constructed by combining multisequence radiomics features and clinical and genetic characteristics in the future.
In summary, the radiomics model based on multisequence MRI has potential in predicting the preoperative expression of PD-1 and PD-L1 in HCC, which could become an imaging biomarker for ICI treatment.

Data availability
All data generated or analyzed during this study are included in this published article and its supplementary information files.